The Human Development Index (HDI) is a statistical composite index of life expectancy, education (mean years of schooling completed and expected years of schooling upon entering the education system), and per capita income indicators, which is used to rank countries into four tiers of human development.
It emphasizes people's capabilities and uses them as the criterion for assessing the development of a country. The data used here are extracted from the United Nations Development Programme: Human Development Reports. The data set shows indicative data for countries with very high, high, medium and low human development.
Project Details
Name: Analysis on Human Development Index 2019 (Group Project - ETC5510)
Objective: To analyze the data set used, by answering four research questions.
It compares the countries on HDI value, HDI rank (2019, 2018) and indicators for SDGs (Sustainable Development Goals) 3, 4 and 8 (2019), where SDG 3 = Life expectancy at birth, SDG 4.3 = Expected years of schooling, SDG 4.4 = Mean years of schooling, SDG 8.5 = Gross national income (GNI) per capita.
Research Questions
To compute the summary statistics for each variable for every Human Development category.
To compare and contrast the ‘GNI columns’ for ‘very high’ and ‘medium development countries’ and to assess whether the ‘very high development countries’ translate their income into human development better than the ‘medium development countries’.
To calculate the gap between the ‘expected years of education’ and the ‘average years of education’ for countries with different levels of Human Development, to analyze whether countries with high levels of human development also have high levels of expected education, and hence to judge whether it is inaccurate to use the ‘average value of expected years of education and average years of education’ for the calculation of HDI.
Compare and contrast HDI ranks for 2018 and 2019 i.e. compare ranks for countries and examine their decline or increase in HDI rank from 2018 to 2019.
Variable Information
| HDI_rank_2019 |
| Country |
| HDI_Value |
| Life_expectancy |
| Expected_years_of_schooling |
| Mean_years_of_schooling |
| GNI_per_capita |
| GNI_rank_minus_HDI_rank |
| HDI_rank_2018 |
| Degree_of_Human_Development |
HDI_rank_2019: Ranking by HDI value for 2019. The HDI is a composite index measuring average achievement in three basic dimensions of human development: a long and healthy life, knowledge and a decent standard of living.
Country: List of countries for which HDI statistics are calculated.
HDI_Value: Summary measure of average achievement in key dimensions of human development: a long and healthy life, being knowledgeable and having a decent standard of living.
Life_expectancy: Life expectancy at birth, i.e. the number of years a new-born infant could expect to live if prevailing patterns of age-specific mortality rates at the time of birth stay the same throughout the infant’s life.
Expected_years_of_schooling: Number of years of schooling that a child of school entrance age can expect to receive if prevailing patterns of age-specific enrollment rates persist throughout the child’s life.
Mean_years_of_schooling: Average number of years of education received by people ages 25 and older, converted from education attainment levels using official durations of each level.
GNI_per_capita: Gross National Income per capita. This is the aggregate income of an economy generated by its production and its ownership of factors of production, less the income paid for the use of factors of production owned by the rest of the world, converted to international dollars using PPP (Purchasing Power Parity) rates, divided by midyear population.
GNI_rank_minus_HDI_rank: Difference in ranking by GNI per capita and by HDI value. A negative value means that the country is better ranked by GNI than by HDI value.
HDI_rank_2018: Ranking by HDI value for 2018, calculated using the same most recently revised data available in 2020 that were used to calculate HDI values for 2019.
Degree_of_Human_Development : The cutoff-points are HDI of less than 0.550 for low human development, 0.550–0.699 for medium human development, 0.700–0.799 for high human development and 0.800 or greater for very high human development.
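The cutoff rule above can be sketched in R with base `cut()`. This is a minimal illustration, not the code used to build the data set: the function name `classify_hdi` and the example values are assumptions made here, while the breakpoints and the upper-case labels follow the description and the tables in this report.

```r
# Hypothetical helper: map HDI values to the four tiers described above.
classify_hdi <- function(hdi_value) {
  cut(
    hdi_value,
    breaks = c(-Inf, 0.550, 0.700, 0.800, Inf),
    labels = c("LOW HUMAN DEVELOPMENT",
               "MEDIUM HUMAN DEVELOPMENT",
               "HIGH HUMAN DEVELOPMENT",
               "VERY HIGH HUMAN DEVELOPMENT"),
    right = FALSE  # intervals are [lower, upper), so 0.800 counts as very high
  )
}

# Made-up HDI values placed on and around the boundaries.
example_values <- c(0.394, 0.549, 0.550, 0.699, 0.700, 0.800, 0.957)
tiers <- classify_hdi(example_values)
data.frame(HDI_Value = example_values, Degree_of_Human_Development = tiers)
```

Setting `right = FALSE` matters here: with the default right-closed intervals, a country with an HDI of exactly 0.800 would fall into the high tier instead of the very high tier.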
Q1. To compute the summary statistics for each variable for every Human Development category.
| Variable | Degree_of_Human_Development | Minimum | Median | Mean | Maximum | SD |
|---|---|---|---|---|---|---|
| Expected_years_of_schooling | VERY HIGH HUMAN DEVELOPMENT | 12.04 | 16.12 | 16.14 | 21.95 | 1.82 |
| Expected_years_of_schooling | HIGH HUMAN DEVELOPMENT | 11.19 | 13.61 | 13.60 | 16.87 | 1.14 |
| Expected_years_of_schooling | MEDIUM HUMAN DEVELOPMENT | 8.28 | 11.60 | 11.50 | 13.72 | 1.13 |
| Expected_years_of_schooling | LOW HUMAN DEVELOPMENT | 5.01 | 9.70 | 9.30 | 12.66 | 1.87 |
| GNI_per_capita | VERY HIGH HUMAN DEVELOPMENT | 14428.80 | 39870.68 | 42929.79 | 131031.59 | 20854.39 |
| GNI_per_capita | HIGH HUMAN DEVELOPMENT | 5039.04 | 13009.07 | 13184.34 | 26903.25 | 4763.56 |
| GNI_per_capita | MEDIUM HUMAN DEVELOPMENT | 2253.35 | 4960.53 | 5694.22 | 13944.13 | 2682.41 |
| GNI_per_capita | LOW HUMAN DEVELOPMENT | 753.91 | 2132.96 | 2385.03 | 5689.35 | 1284.51 |
| HDI_Value | VERY HIGH HUMAN DEVELOPMENT | 0.80 | 0.88 | 0.88 | 0.96 | 0.05 |
| HDI_Value | HIGH HUMAN DEVELOPMENT | 0.70 | 0.74 | 0.75 | 0.80 | 0.03 |
| HDI_Value | MEDIUM HUMAN DEVELOPMENT | 0.55 | 0.61 | 0.62 | 0.70 | 0.04 |
| HDI_Value | LOW HUMAN DEVELOPMENT | 0.39 | 0.48 | 0.49 | 0.55 | 0.05 |
| Life_expectancy | VERY HIGH HUMAN DEVELOPMENT | 72.58 | 80.21 | 79.45 | 84.86 | 3.29 |
| Life_expectancy | HIGH HUMAN DEVELOPMENT | 64.13 | 74.25 | 74.00 | 78.93 | 3.27 |
| Life_expectancy | MEDIUM HUMAN DEVELOPMENT | 58.74 | 69.66 | 68.43 | 76.68 | 4.72 |
| Life_expectancy | LOW HUMAN DEVELOPMENT | 53.28 | 62.05 | 61.95 | 69.02 | 4.37 |
| Mean_years_of_schooling | VERY HIGH HUMAN DEVELOPMENT | 7.28 | 12.14 | 11.62 | 14.15 | 1.47 |
| Mean_years_of_schooling | HIGH HUMAN DEVELOPMENT | 7.02 | 9.39 | 9.47 | 11.81 | 1.26 |
| Mean_years_of_schooling | MEDIUM HUMAN DEVELOPMENT | 4.07 | 6.50 | 6.51 | 11.10 | 1.52 |
| Mean_years_of_schooling | LOW HUMAN DEVELOPMENT | 1.64 | 3.93 | 4.25 | 6.76 | 1.37 |
Q2. To compare and contrast the GNI columns for very high and medium development countries and to conduct an analysis to assess whether the very high development countries translate their income better than the medium development countries in the areas of human development.
| Degree_of_Human_Development | min | q1 | median | q3 | max | mean | sd | n |
|---|---|---|---|---|---|---|---|---|
| MEDIUM HUMAN DEVELOPMENT | 7.7 | 8.3 | 8.5 | 8.9 | 9.5 | 8.6 | 0.4 | 37 |
| VERY HIGH HUMAN DEVELOPMENT | 9.6 | 10.2 | 10.6 | 10.9 | 11.8 | 10.6 | 0.5 | 66 |
Table 2.2: Values for very high HDI group. Dependent variable: Log of GNI per capita.
| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 4.67 | 3.26 – 6.07 | <0.001 |
| HDI_Value | 6.71 | 5.11 – 8.30 | <0.001 |
| Observations | 66 | | |
| R² / R² adjusted | 0.525 / 0.517 | | |
Table 2.3: Values for medium HDI group. Dependent variable: Log of GNI per capita.
| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 4.84 | 3.02 – 6.65 | <0.001 |
| HDI_Value | 6.01 | 3.08 – 8.94 | <0.001 |
| Observations | 37 | | |
| R² / R² adjusted | 0.331 / 0.312 | | |
The slope is 6.71 for the Very High HDI Group and 6.01 for the Medium HDI Group.
In Fig 2.5, the left-side graphs represent the ‘Very High HDI Group’ and the right-side graphs represent the ‘Medium HDI Group’.
From the diagnostic plots, we see that the residuals of both models are approximately normally distributed and that both models fit the data reasonably well.
However, from Table 2.4 and Table 2.5, we see that both models have a fairly low coefficient of determination (R² of 0.525 and 0.331, respectively). This suggests that GNI is not the only fundamental variable that determines the HDI value.
Therefore, we can conclude that although the very high HDI group performs better in both GNI and HDI than the medium HDI group, there is only a moderate correlation between GNI and HDI values, which is limited evidence that the very high HDI group translates income better.
| r.squared | adj.r.squared | sigma | statistic | p.value | df | logLik | AIC | BIC | deviance | df.residual | nobs |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.5247895 | 0.5173643 | 0.3244632 | 70.67715 | 0 | 1 | -18.34599 | 42.69197 | 49.26094 | 6.737687 | 64 | 66 |
| r.squared | adj.r.squared | sigma | statistic | p.value | df | logLik | AIC | BIC | deviance | df.residual | nobs |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.3311233 | 0.3120126 | 0.3627941 | 17.32654 | 0.0001946 | 1 | -13.95765 | 33.91531 | 38.74806 | 4.606685 | 35 | 37 |
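Because both models regress the log of GNI per capita on HDI_Value, each slope is a change in log income per unit of HDI. One way to read the reported slopes (6.71 and 6.01) is to convert them into the approximate percentage difference in GNI per capita associated with a 0.01 difference in HDI. The sketch below does that arithmetic; the 0.01 step and the variable names are assumptions chosen for illustration, and the result describes an association within each group, not a causal effect.

```r
# Slopes copied from the regression tables above (log GNI per capita ~ HDI_Value).
slopes <- c(very_high = 6.71, medium = 6.01)

# Assumed step: a difference of 0.01 in HDI value.
hdi_step <- 0.01

# On the log scale a difference of b * step corresponds to a ratio of exp(b * step).
gni_ratio <- exp(slopes * hdi_step)
pct_change <- round((gni_ratio - 1) * 100, 1)
pct_change
```

Under these assumptions the two groups come out at roughly 6.9% and 6.2%, which is a small difference relative to the width of the confidence intervals in the tables.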
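Q3. Calculate the gap between the ‘expected years of education’ and the ‘average years of education’ for countries with different levels of Human Development. Let this gap be called: Residual years of education.
We analyze whether countries with high levels of human development also have high levels of expected education, and hence judge whether it is inaccurate to use the ‘average value of expected years of education and average years of education’ for the calculation of HDI.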
| Country | Degree_of_Human_Development | residual_years_of_schooling |
|---|---|---|
| Australia | VERY HIGH HUMAN DEVELOPMENT | 9.229639 |
| Bhutan | MEDIUM HUMAN DEVELOPMENT | 8.908858 |
| Benin | LOW HUMAN DEVELOPMENT | 8.788222 |
| Turkey | VERY HIGH HUMAN DEVELOPMENT | 8.495846 |
| Morocco | MEDIUM HUMAN DEVELOPMENT | 8.073170 |
| Uruguay | VERY HIGH HUMAN DEVELOPMENT | 7.909910 |
| Tunisia | HIGH HUMAN DEVELOPMENT | 7.905390 |
| Grenada | HIGH HUMAN DEVELOPMENT | 7.837766 |
| Timor-Leste | MEDIUM HUMAN DEVELOPMENT | 7.821715 |
| Burundi | LOW HUMAN DEVELOPMENT | 7.781347 |
| Nepal | MEDIUM HUMAN DEVELOPMENT | 7.742130 |
| Belgium | VERY HIGH HUMAN DEVELOPMENT | 7.724110 |
| Degree_of_Human_Development | Minimum RYS | Maximum RYS | Median RYS |
|---|---|---|---|
| VERY HIGH HUMAN DEVELOPMENT | 1.462530 | 9.229639 | 4.112699 |
| HIGH HUMAN DEVELOPMENT | -0.174090 | 7.905390 | 4.157615 |
| MEDIUM HUMAN DEVELOPMENT | 0.928100 | 8.908858 | 4.987901 |
| LOW HUMAN DEVELOPMENT | 0.496258 | 8.788222 | 5.108076 |
From Fig 3.3 and Fig 3.4, we observe that there are more countries in the ‘high HDI group’ than in the ‘low HDI group’, and that only a small part of the ‘high HDI group’ has excessive differences.
Q4. Compare and contrast HDI ranks for 2018 and 2019 i.e. compare ranks for countries and examine their decline or increase in HDI rank from 2018 to 2019.
| No. of countries with rank difference |
|---|
| 112 |
More countries experienced a decline in rank from 2018 to 2019 (negative values on the graph) than an increase.
Larger negative values are observed for the ‘high HDI group’, i.e. rank declines are steeper among High HDI group countries.
The maximum increase is observed in the ‘high HDI group’ as well, i.e. among countries whose rank increased, most are from the High HDI group.
| No. of countries with the same rank |
|---|
| 77 |
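The counts above come from comparing HDI_rank_2018 with HDI_rank_2019 for each country. A minimal sketch of that comparison is shown below. It is not the project's own Q4 code: the five rows are hypothetical ranks invented for illustration, and the column names `rank_change` and `direction` are assumptions. The sign convention (2018 rank minus 2019 rank) is chosen so that a decline in rank is negative, matching the description of the graph.

```r
library(dplyr)

# Hypothetical ranks for illustration only; not taken from the HDI data set.
ranks <- tibble(
  Country = c("A", "B", "C", "D", "E"),
  HDI_rank_2018 = c(1, 5, 10, 20, 31),
  HDI_rank_2019 = c(1, 7, 8, 20, 35)
)

rank_changes <- ranks %>%
  mutate(
    # Negative: the country fell in the ranking; positive: it moved up.
    rank_change = HDI_rank_2018 - HDI_rank_2019,
    direction = case_when(
      rank_change > 0 ~ "increase",
      rank_change < 0 ~ "decline",
      TRUE ~ "same"
    )
  )

# Number of countries per direction of change.
direction_counts <- count(rank_changes, direction)
direction_counts
```

Applied to the full data set, the "same" row would correspond to the count of countries with an unchanged rank, and the "decline" and "increase" rows together to the count of countries with a rank difference.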
Analysis for Q4 Part B.
Most countries fall under Very High Human Development and few under Low Human Development.
HDI value, life expectancy, and mean years of schooling have more values as compared to ‘expected years of schooling’ and GNI. Outliers exist only in the variables ‘expected years of schooling’ and GNI per capita.
Higher income is associated with a higher human development group, which means a higher HDI value.
Very High HDI Group translates income better.
It is accurate to use the ‘average value of expected years of education and average years of education’ for the calculation of HDI.
There are 112 countries whose HDI rank differs between 2018 and 2019. Of these, more countries declined in rank from 2018 to 2019 than rose.
There are 77 countries that have the same rank for 2018 and 2019; among them, Norway has maintained its 1st rank overall.
References
[1] Wickham et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
[2] Hadley Wickham and Jim Hester (2020). readr: Read Rectangular Text Data. R package version 1.4.0. https://CRAN.R-project.org/package=readr
[3] Hadley Wickham and Evan Miller (2020). haven: Import and Export ‘SPSS’, ‘Stata’ and ‘SAS’ Files. R package version 2.3.1. https://CRAN.R-project.org/package=haven
[4] Hao Zhu (2021). kableExtra: Construct Complex Table with ‘kable’ and Pipe Syntax. R package version 1.3.4. https://CRAN.R-project.org/package=kableExtra
[5] R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
[6] Frank E Harrell Jr, with contributions from Charles Dupont and many others. (2021). Hmisc: Harrell Miscellaneous. R package version 4.5-0. https://CRAN.R-project.org/package=Hmisc
[7] Baptiste Auguie (2017). gridExtra: Miscellaneous Functions for “Grid” Graphics. R package version 2.3. https://CRAN.R-project.org/package=gridExtra
[8] Katherine Goode and Kathleen Rey (2019). ggResidpanel: Panels and Interactive Versions of Diagnostic Plots using ‘ggplot2’. R package version 0.3.0. https://CRAN.R-project.org/package=ggResidpanel
[9] Claus O. Wilke (2020). cowplot: Streamlined Plot Theme and Plot Annotations for ‘ggplot2’. R package version 1.1.1. https://CRAN.R-project.org/package=cowplot
[10] Lüdecke D (2021). sjPlot: Data Visualization for Statistics in Social Science. R package version 2.8.7. https://CRAN.R-project.org/package=sjPlot
[11] Silge J, Robinson D (2016). tidytext: Text Mining and Analysis Using Tidy Data Principles in R. Journal of Open Source Software, 1(3). https://doi.org/10.21105/joss.00037
[12] Kovacevic, M. (2010). Review of HDI Critiques and Potential Improvements. Human Development Research Paper, 2010/33. Retrieved 24 May 2021, from https://www.researchgate.net/publication/235945302_Review_of_HDI_Critiques_and_Potential_Improvements_Human_Development_Research_Paper_201033
[13] Human Development Index (HDI) | Human Development Reports. (2021). Retrieved 24 May 2021, from http://hdr.undp.org/en/content/human-development-index-hdi
Credits :
Xiaoyu Tian
Nishtha Arora
Shaohu Chen
Nurlaily Furqandari Suliana
---
title: "Analysis on Human Development Index 2019"
author: "T12_Fri_kable"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
    vertical_layout: fill
    source_code: embed
---
```{r Global, message = FALSE, warning= FALSE, echo=FALSE}
knitr::opts_chunk$set(fig.width=15, fig.height=10, fig.align = "center")
```
```{r LoadingLibraries, message=FALSE, warning=FALSE, echo=FALSE}
library(tidyverse)
library(readr)
library(haven)
library(kableExtra)
library(grid)
library(Hmisc)
library(gridExtra)
library(ggResidpanel)
library(cowplot)
library(sjPlot)
library(flexdashboard)
library(tidytext)
library(ggplot2)
library(plotly)
library(countrycode)
```
Part A {data-navmenu="Introduction"}
=====================================
+ The Human Development Index (HDI) is a **statistical composite index of life expectancy, education (mean years of schooling completed and expected years of schooling upon entering the education system), and per capita income indicators, which is used to rank countries into four tiers of human development.**
+ It emphasizes people's capabilities and uses them as the criterion for **assessing the development of a country**. The data used here are extracted from [United Nations Development Programme: Human Development Reports](http://hdr.undp.org/en/composite/HDI). The data set shows indicative **data for countries with very high, high, medium and low human development**.
**Project Details**
+ Name: **Analysis on Human Development Index 2019** (Group Project - ETC5510)
+ Objective: To analyze the data set used, by answering **four research questions.**
+ It compares the countries on HDI value, HDI rank (2019, 2018) and indicators for SDGs (Sustainable Development Goals) 3, 4 and 8 (2019), where SDG 3 = Life expectancy at birth, SDG 4.3 = Expected years of schooling, SDG 4.4 = Mean years of schooling, SDG 8.5 = Gross national income (GNI) per capita.
**Research Questions**
+ To compute the **summary statistics** for each variable for every Human Development category.
+ To compare and contrast the **'GNI columns' for 'very high' and 'medium development countries'** and to assess **whether the 'very high development countries' translate their income into human development better than the 'medium development countries'**.
+ To calculate the gap between the ‘expected years of education’ and the ‘average years of education’ for countries with different levels of Human Development, to analyze **whether countries with high levels of human development also have high levels of expected education**, and hence to judge **whether it is inaccurate to use the ‘average value of expected years of education and average years of education’ for the calculation of HDI.**
+ Compare and contrast HDI ranks for 2018 and 2019 i.e. compare ranks for countries and **examine their decline or increase in HDI rank from 2018 to 2019.**
###
```{r img1, echo = F, out.width = '3%'}
knitr::include_graphics("images/us.jpg")
```
###
```{r img5, echo = F, out.width = '3%'}
knitr::include_graphics("images/ind.webp")
```
###
```{r img6, echo = F, out.width = '3%'}
knitr::include_graphics("images/money.jpg")
```
Part B {data-navmenu="Introduction"}
=====================================
**Variable Information**
```{r ReadingTidyData, message=FALSE, warning=FALSE, echo=FALSE}
HDI <- read_csv("data/HDIData.csv") %>% rename(Degree_of_Human_Development = y)
```
### **Names**
```{r, echo= FALSE, warning= FALSE, message = FALSE, fig.width= 3}
# Avoid shadowing base::summary(); show the column names as a one-column table.
variable_names <- colnames(HDI)
knitr::kable(variable_names, col.names = "Variable")
```
### **Description**
- **HDI_rank_2019**: Ranking by HDI value for 2019. The HDI is a **composite index measuring average achievement** in three basic dimensions of human development: a long and healthy life, knowledge and a decent standard of living.
- Country: List of countries for which HDI statistics are calculated.
- **HDI_Value**: **Summary measure of average achievement** in key dimensions of human development: a long and healthy life, being knowledgeable and having a decent standard of living.
- **Life_expectancy**: Life expectancy at birth, i.e. the number of years a new-born infant could **expect to live** if prevailing patterns of age-specific mortality rates at the time of birth stay the same throughout the infant’s life.
- **Expected_years_of_schooling**: **Number of years of schooling** that a child of school entrance age **can expect to receive** if prevailing patterns of age-specific enrollment rates persist throughout the child’s life.
- **Mean_years_of_schooling**: **Average number of years of education received** by people ages 25 and older, converted from education attainment levels using official durations of each level.
- **GNI_per_capita**: **Gross National Income per capita**. This is the **aggregate income of an economy** generated by its production and its ownership of factors of production, less the income paid for the use of factors of production owned by the rest of the world, converted to international dollars using PPP (Purchasing Power Parity) rates, divided by midyear population.
- GNI_rank_minus_HDI_rank: Difference in ranking by GNI per capita and by HDI value. A negative value means that the country is better ranked by GNI than by HDI value.
- HDI_rank_2018: Ranking by HDI value for 2018, calculated using the same most recently revised data available in 2020 that were used to calculate HDI values for 2019.
- **Degree_of_Human_Development** : **The cutoff-points are HDI of less than 0.550 for low human development, 0.550–0.699 for medium human development, 0.700–0.799 for high human development and 0.800 or greater for very high human development.**
Question 1
=====================================
Q1. **To compute the summary statistics for each variable for every Human Development category.**
Column {data-width=300}
---------------------------------------------------
### Fig 1.1: The **number of countries** in each degree of human development.
```{r Fig1, fig.height= 4, fig.width=7, message=FALSE, warning=FALSE, echo=FALSE}
plot1 <- ggplot(HDI, aes(x = Degree_of_Human_Development,
                         fill = Degree_of_Human_Development)) +
  geom_bar() +
  scale_x_discrete(labels = NULL) +
  # Apply the complete theme first; a later theme_classic() would reset these tweaks.
  theme_classic() +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        plot.title = element_text(hjust = 0.5))
ggplotly(plot1)
```
### Fig 1.2: Boxplots that summarize the **descriptive statistics** of each variable.
```{r Fig2, fig.height= 3, fig.width=9, echo= FALSE, warning=FALSE, message=FALSE}
HDI_Valueplot <- ggplot(HDI,
aes(y = HDI_Value,
fill = "HDI_Value")) +
labs(x = 'HDI_Value', y = '') +
geom_boxplot(show.legend = FALSE) +
theme(axis.text.x = element_blank(),
axis.ticks.x = element_blank())
Life_expectancyplot <- ggplot(HDI,
aes(y = Life_expectancy,
fill = "Life_expectancy")) +
labs(x = 'Life Expectancy', y = '') +
geom_boxplot(show.legend = FALSE) +
theme(axis.text.x = element_blank(),
axis.ticks.x = element_blank())
Expected_years_of_schoolingplot <- ggplot(HDI,
aes(y = Expected_years_of_schooling,
fill = "Expected_years_of_schooling")) +
labs(x = 'Expected Year of Schooling', y = '') +
geom_boxplot(show.legend = FALSE) +
theme(axis.text.x = element_blank(),
axis.ticks.x = element_blank())
Mean_years_of_schoolingplot <- ggplot(HDI,
aes(y = Mean_years_of_schooling,
fill = "Mean_years_of_schooling")) +
labs(x = 'Mean Year of Schooling', y = '') +
geom_boxplot(show.legend = FALSE) +
theme(axis.text.x = element_blank(),
axis.ticks.x = element_blank())
GNI_per_capitaplot <- ggplot(HDI,
aes(y = GNI_per_capita,
fill = "GNI_per_capita")) +
labs(x = 'GNI per Capita', y = '') +
geom_boxplot(show.legend = FALSE) +
scale_y_continuous(labels = scales::comma)+
theme(axis.text.x = element_blank(),
axis.ticks.x = element_blank())
grid.arrange(HDI_Valueplot,
Expected_years_of_schoolingplot,
Life_expectancyplot,
Mean_years_of_schoolingplot,
GNI_per_capitaplot,
nrow = 1)
```
Column {data-height=300}
-------------------------------------
### Table 1.1: **Descriptive analysis** for each variable based on Degree of Human Development.
```{r Tab1, message = FALSE, warning= FALSE, echo=FALSE}
HDI_long <- pivot_longer(HDI, c(3:7), names_to = "Variable") %>%
group_by(Variable, Degree_of_Human_Development) %>%
summarise(Minimum = round(min(value), digits = 2),
Median = round(median(value), digits = 2),
Mean = round(mean(value), digits = 2),
Maximum = round(max(value), digits = 2),
SD = round(sd(value), digits = 2)) %>%
arrange(Variable, -Maximum)
knitr::kable(HDI_long,
booktabs = TRUE) %>%
kable_styling(full_width = TRUE, bootstrap_options = "bordered") %>%
kable_classic()
```
Part A {data-navmenu="Question 2"}
=====================================
Q2. To compare and contrast the **GNI columns** for **very high and medium development countries** and to conduct an analysis to assess whether the **very high development countries translate their income better than the medium development countries** in the areas of human development.
Column {data-width=400}
-------------------------------------
```{r Q2Filter, message = FALSE, warning= FALSE, echo=FALSE}
hdi_r <- HDI %>%
select(Country, HDI_rank_2019, HDI_Value, GNI_per_capita, GNI_rank_minus_HDI_rank, Degree_of_Human_Development) %>%
filter(Degree_of_Human_Development %in% c("VERY HIGH HUMAN DEVELOPMENT", "MEDIUM HUMAN DEVELOPMENT")) %>%
mutate(gni_rank_2019 = HDI_rank_2019 + GNI_rank_minus_HDI_rank)
```
### Fig 2.1: Understanding the **distribution of variables**.
```{r Fig3, warning=FALSE, message=FALSE, echo=FALSE, fig.height= 4, fig.width=7}
plot2 <- hdi_r %>% ggplot(aes(x = GNI_per_capita, fill = Degree_of_Human_Development)) +
geom_histogram(bins = 29, alpha = 0.4) +
scale_x_continuous(labels = scales::comma)+
scale_fill_manual(values=c("#E69F00", "#56B4E9"))+
theme_classic()
ggplotly(plot2)
```
### Fig 2.2: **Log-transformation of GNI per capita** for further analysis.
```{r Fig4, warning=FALSE, message=FALSE, echo=FALSE, fig.height= 4, fig.width=7}
lnhdi_r <- hdi_r %>%
mutate(`Log of GNI per capita` = log(GNI_per_capita))
plot3 <- lnhdi_r %>% ggplot(aes(x = `Log of GNI per capita`, fill = Degree_of_Human_Development))+
geom_histogram(bins = 29, alpha = 0.4) +
scale_fill_manual(values=c("#E69F00", "#56B4E9"))+
theme_classic()
ggplotly(plot3)
```
Column {data-height=250}
-------------------------------------
### Table 2.1: **GNI Summary** Statistics.
```{r Tab2, message = FALSE, warning= FALSE, echo=FALSE}
lnhdi_r %>%
group_by(Degree_of_Human_Development) %>%
summarise(min = min(`Log of GNI per capita`, na.rm=TRUE),
q1 = quantile(`Log of GNI per capita`, 0.25, na.rm=TRUE),
median = median(`Log of GNI per capita`, na.rm=TRUE),
q3 = quantile(`Log of GNI per capita`, 0.75, na.rm=TRUE),
max = max(`Log of GNI per capita`, na.rm=TRUE),
mean = mean(`Log of GNI per capita`, na.rm=TRUE),
sd = sd(`Log of GNI per capita`, na.rm=TRUE),
n = n()) %>%
kbl(digits = 1)%>%
kable_material("hover", full_width = T)
```
Part B {data-navmenu="Question 2"}
=====================================
Column {data-height=400}
-------------------------------------
### Fig 2.3: No. of Countries in each degree of **HDI V/S GNI per capita (log)** plot.
```{r Fig5, warning=FALSE, message=FALSE, echo=FALSE, fig.height= 5, fig.width=5}
a1 <- lnhdi_r %>%
ggplot(aes(x = Degree_of_Human_Development , y= (`Log of GNI per capita`))) +
geom_violin(draw_quantiles = c(0.25, 0.5, 0.75),
fill = "lemonchiffon1") +
  ylab("Log of GNI per capita") +
xlab("") +
theme_classic()
ggplotly(a1)
```
### Fig 2.4: Comparing the **regression**.
```{r Fig6, message = FALSE, warning= FALSE, echo=FALSE, fig.height= 5, fig.width=7}
plot4 <- lnhdi_r %>%
  ggplot(aes(x = `Log of GNI per capita`, y = HDI_Value, colour = Degree_of_Human_Development)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  scale_color_manual(values = c("Red", "Blue")) +
  labs(x = "Log of GNI per capita", y = "HDI value", color = "HDI group") +
  # Apply the complete theme first so the legend tweak below is not overwritten.
  theme_classic() +
  theme(legend.position = "top")
ggplotly(plot4)
```
Column {data-height=200}
-------------------------------------
### Table 2.2: Values for **very high HDI group**.
```{r, message = FALSE, warning= FALSE, echo=FALSE}
vhigh <- lnhdi_r %>% filter(Degree_of_Human_Development == "VERY HIGH HUMAN DEVELOPMENT")
vmod <- lm(`Log of GNI per capita` ~ HDI_Value, data = vhigh)
tab_model(vmod)
```
### Table 2.3: Values for **medium HDI group**.
```{r Tab3, message = FALSE, warning= FALSE, echo=FALSE}
med <- lnhdi_r %>% filter(Degree_of_Human_Development == "MEDIUM HUMAN DEVELOPMENT")
mmod <- lm(`Log of GNI per capita` ~ HDI_Value, data = med)
tab_model(mmod)
```
Column {data-height=50}
-------------------------------------
The **slope is 6.71** for the Very High HDI Group and the **slope is 6.01** for the Medium HDI Group.
Part C {data-navmenu="Question 2"}
=====================================
Column {data-height=300}
-------------------------------------
### Fig 2.5: Diagnostic **Plots for Linear Models**.
```{r Fig7, message = FALSE, warning= FALSE, echo=FALSE, fig.width= 8, fig.height= 5}
resid_compare(models = list(vmod,mmod
),
plots = c("resid", "qq", "hist"),
smoother = TRUE,
qqbands = TRUE,
title.opt = TRUE)+
theme_classic()
```
### Analysis for Q2 Part C.
+ In Fig 2.5, the **left-side graphs** represent the 'Very High HDI Group' and the **right-side graphs** represent the 'Medium HDI Group'.
+ From the diagnostic plots, we see that the **residuals of both models are approximately normally distributed** and that both models **fit the data reasonably well**.
+ However, from Table 2.4 and Table 2.5, we see that **both models have a fairly low coefficient of determination**. This suggests that **GNI is not the only fundamental variable that determines the HDI value.**
+ Therefore, we can conclude that although the very high HDI group performs better in both GNI and HDI than the medium HDI group, there is only a **moderate correlation between GNI and HDI values, which is limited evidence that the very high HDI group translates income better.**
Column {data-height=100}
-------------------------------------
### Table 2.4: **Very High HDI Group** Linear Model Values.
```{r Tab4, message = FALSE, warning= FALSE, echo=FALSE}
# Model-level summary for the very high HDI group (the Table 2.5 chunk computes its own).
i <- broom::glance(vmod)
knitr::kable(i, align = "c") %>%
  kable_material(c("striped", "hover"))
```
Column {data-height=100}
-------------------------------------
### Table 2.5: **Medium HDI Group** Linear Model Values.
```{r Tab9, message = FALSE, warning= FALSE, echo=FALSE}
j <- broom::glance(mmod)
knitr::kable(j, align = "c") %>%
kable_material(c("striped", "hover"))
```
Part A {data-navmenu="Question 3"}
=====================================
Column {data-height=100}
---------------------------------------------------
### Q3. Calculate the **gap** between the ‘expected years of education’ and the ‘average years of education’ for countries with different levels of Human Development. Let this gap be called: **Residual years of education**.
We analyze whether countries with high levels of human development also have high levels of expected education, and hence judge **whether it is inaccurate to use the ‘average value of expected years of education and average years of education’ for the calculation of HDI.**
Column {data-height=500}
---------------------------------------------------
```{r Q3Filter, message = FALSE, warning= FALSE, echo=FALSE}
HDI_residual <- HDI %>%
mutate(residual_years_of_schooling = Expected_years_of_schooling - Mean_years_of_schooling) %>% arrange(desc(residual_years_of_schooling))
HDI_residual_col <- HDI_residual %>% mutate(Degree_of_Human_Development = as.factor(Degree_of_Human_Development), new_col = reorder_within(Country, residual_years_of_schooling,Degree_of_Human_Development))
```
### Fig 3.1: **Residual years** of education for **High HDI group (Green)** and **V.High HDI group (Red).**
```{r Fig8, message = FALSE, warning= FALSE, echo=FALSE, fig.height= 17, fig.width=16 }
a <- HDI_residual_col %>%
filter(Degree_of_Human_Development %in% c("VERY HIGH HUMAN DEVELOPMENT", "HIGH HUMAN DEVELOPMENT"))
ggplot(a,
aes(x = residual_years_of_schooling,
y = new_col, fill= Degree_of_Human_Development), show.legend = FALSE) +
geom_col(cex = 5)+
scale_y_reordered() +
facet_wrap(~Degree_of_Human_Development, scales = "free") +
labs( y = "Countries names",
x = "Residual years of education"
) +
theme_classic(base_size = 15)+
theme(legend.position = "none")+
theme(axis.text = element_text(size = 15))+
scale_fill_brewer(palette = "Dark2")
```
### Fig 3.2: **Residual years** of education for **Low HDI group (Light Green)** and **Medium HDI group (Orange).**
```{r Fig, message = FALSE, warning= FALSE, echo=FALSE, fig.height= 17, fig.width=16 }
b <- HDI_residual_col %>%
filter(Degree_of_Human_Development %in% c("MEDIUM HUMAN DEVELOPMENT", "LOW HUMAN DEVELOPMENT"))
ggplot(b,
aes(x = residual_years_of_schooling,
y = new_col, fill= Degree_of_Human_Development), show.legend = FALSE) +
geom_col(cex = 5)+
scale_y_reordered() +
facet_wrap(~Degree_of_Human_Development, scales = "free") +
labs( y = "Countries' names",
x = "Residual years of education"
) +
theme_classic(base_size = 15)+
theme(legend.position = "none")+
theme(axis.text = element_text(size = 15))+
scale_fill_brewer(palette = "Pastel2")
```
Column {data-height=50}
---------------------------------------------------
### Analysis for Q3 Part A.
+ Fig 3.1 and 3.2 show that the residual years of education in **some** countries with high human development are larger than in countries with low human development (Kovacevic, 2010).
Part B {data-navmenu="Question 3"}
=====================================
Column {data-height=400}
-------------------------------------
### Table 3.1: Top 12 countries with the **largest residual years** of schooling across all degrees of Human Development.
```{r Tab5, message = FALSE, warning= FALSE, echo=FALSE, fig.width= 12}
# Take the 12 largest gaps overall; grouping is not needed for an overall ranking.
detail_educa_year <- HDI_residual %>%
  select(Country, Degree_of_Human_Development, residual_years_of_schooling) %>%
  arrange(desc(residual_years_of_schooling)) %>%
  head(12)
knitr::kable(detail_educa_year,
booktabs = TRUE) %>%
kable_styling(bootstrap_options = c("striped", "hold_position")) %>%
kable_material() %>%
row_spec(1, bold = T, color = "Black", background = "Red") %>%
row_spec(2, bold = T, color = "Black", background = "Gray") %>%
row_spec(3, bold = T, color = "Black", background = "Yellow") %>%
row_spec(7, bold = T, color = "Black", background = "orange")
```
Column {data-height=300}
-------------------------------------
### Table 3.2: **Summary of residual years of schooling** (RYS) for each Degree of Human Development
```{r Tab6, message = FALSE, warning= FALSE, echo=FALSE, fig.width=8}
summary_educa_year <-HDI_residual %>%
group_by(Degree_of_Human_Development) %>%
summarise(`Minimum RYS` = min(residual_years_of_schooling, na.rm = TRUE), `Maximum RYS`= max(residual_years_of_schooling, na.rm = TRUE),
`Median RYS` = median(residual_years_of_schooling, na.rm = TRUE)) %>% arrange(`Median RYS`)
knitr::kable(summary_educa_year,
booktabs = TRUE) %>%
kable_styling(bootstrap_options = c("striped", "hold_position")) %>%
kable_material()
```
Column {data-height=50}
---------------------------------------------------
### Analysis for Q3 Part B.
+ Table 3.2 contradicts table 3.1.
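
One possible reason the two tables can appear to disagree is how the rows of Table 3.1 are selected. The sketch below uses a small hypothetical data set (the `toy` tibble and its values are illustrative assumptions, not the project's data) to show that `head()` after `group_by()` returns the top rows overall, whereas `slice_max()` returns the top rows within each category. It is a minimal illustration under those assumptions, not a statement of how Table 3.1 should be built.

```r
library(dplyr)

# Hypothetical toy data; column names mirror the dashboard's columns.
toy <- tibble(
  Country = c("A", "B", "C", "D", "E", "F"),
  Degree_of_Human_Development = rep(c("HIGH", "LOW"), each = 3),
  residual_years_of_schooling = c(9, 8, 7, 3, 2, 1)
)

# head() is not group-aware: this keeps the top 2 rows overall.
overall_top <- toy %>%
  group_by(Degree_of_Human_Development) %>%
  arrange(desc(residual_years_of_schooling)) %>%
  head(2)

# slice_max() respects the grouping: this keeps the top 2 rows per category.
per_group_top <- toy %>%
  group_by(Degree_of_Human_Development) %>%
  slice_max(residual_years_of_schooling, n = 2) %>%
  ungroup()
```

With the toy data, `overall_top` contains only countries from one category, while `per_group_top` contains two countries from each category. A table built the first way can therefore be dominated by a single group even when the per-group medians rank the groups differently.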
Part C {data-navmenu="Question 3"}
=====================================
Column {data-height=400}
-------------------------------------
### Fig: 3.3 **Count of Residual years v/s Country** with respect to different HDI categories.
```{r Fig9, fig.height= 12, fig.width=20, message = FALSE, warning= FALSE, echo=FALSE}
fig1<- HDI_residual %>%
mutate(residual_years_of_schooling_log = log10(residual_years_of_schooling+20)) %>%
mutate(Degree_of_Human_Development = fct_reorder(as_factor(Degree_of_Human_Development),
residual_years_of_schooling_log,
median, na.rm= TRUE)) %>%
ggplot(aes(x=Degree_of_Human_Development, y = residual_years_of_schooling_log, fill= Degree_of_Human_Development)) +
geom_point(alpha = 0.2) +
geom_jitter(position=position_jitter(0.2)) +
stat_summary(fun.data="mean_sdl", fun.args = list(mult=1),
geom="pointrange")+
xlab("Degree of Human Development") +
ylab("residual years of schooling (log)") +
theme_bw() +
scale_x_discrete(labels = NULL)+
theme(axis.text = element_text(size = 7))
ggplotly(fig1)
```
### Fig: 3.4 **Count of Residual years v/s Country** with respect to different HDI categories.
```{r Fig12, fig.height= 12, fig.width=20, message = FALSE, warning= FALSE, echo=FALSE}
fig2<-HDI_residual %>%
mutate(residual_years_of_schooling_log = log10(residual_years_of_schooling+20)) %>%
mutate(Degree_of_Human_Development = fct_reorder(as_factor(Degree_of_Human_Development),
residual_years_of_schooling_log,
median, na.rm= TRUE)) %>%
ggplot(aes(x=Degree_of_Human_Development, y = residual_years_of_schooling_log,
fill= Degree_of_Human_Development)) +
geom_point(alpha = 0.2) +
geom_violin(draw_quantiles = c(0.1, 0.25, 0.5)) +
scale_x_discrete(labels = NULL)+
xlab("Degree of Human Development") +
ylab("residual years of schooling (log)") +
theme_classic()
ggplotly(fig2)
```
Column {data-height=50}
-------------------------------------
### Analysis for Q3 Part C.
From Fig 3.3 and Fig 3.4, we observe that there are more countries in the 'high HDI group' than in the 'low HDI group', **and that only a small part of the 'high HDI group' has excessive differences.**
Part A {data-navmenu="Question 4"}
=====================================
Q4. Compare and contrast HDI ranks for 2018 and 2019, i.e. compare ranks for countries and **examine their decline or increase in HDI rank** from 2018 to 2019.
Column {data-height=150}
---------------------------------------------------
```{r SelectingforQ4, message=FALSE, warning=FALSE, echo=FALSE}
Q4 <- HDI %>%
select(HDI_rank_2019, Country,HDI_rank_2018, Degree_of_Human_Development) %>%
mutate(Difference = if_else(condition = HDI_rank_2019 == HDI_rank_2018,
true = "Same",
false = "Different"))
```
```{r Diff, message=FALSE, warning=FALSE, echo=FALSE}
Diffr <- Q4 %>%
dplyr::filter(Difference == "Different")
Same <- Q4 %>%
dplyr::filter(Difference == "Same")
```
### Table 4.1: Count of countries with rank difference.
```{r Tab7, message=FALSE, warning=FALSE, echo=FALSE, fig.width=5}
Tab1 <- count(Diffr) %>%
rename(`No. of countries with rank difference` = n )
knitr::kable(Tab1, align = "c") %>%
kable_material()
```
Column {data-height=450}
---------------------------------------------------
### Fig: 4.1 **Rank difference v/s Countries**
```{r Fig10, message=FALSE, warning=FALSE, echo=FALSE, fig.width=7, fig.height=5}
Fig1 <- Diffr %>%
mutate(IncOrDec = if_else(condition = HDI_rank_2019 > HDI_rank_2018,
true = "Increase",
false = "Decline")) %>%
mutate(DiffValue = (HDI_rank_2019 - HDI_rank_2018))
plot6 <- ggplot(Fig1 , aes( IncOrDec, DiffValue, fill= Degree_of_Human_Development)) +
geom_bar(stat = "Identity") +
theme_classic() +
scale_fill_brewer(palette = "Pastel2")+
theme(axis.title.x = element_blank())
ggplotly(plot6)
```
### Analysis for Q4 Part A.
+ **More countries** have experienced a **decline in rank** from 2018 to 2019 (negative values on the graph) than an increase.
+ **Larger 'negative values'** are observed for the **'high HDI group'**, i.e. the decline in rank is greatest among High HDI group countries.
+ The **maximum increase** is also observed in the '**high HDI group**', i.e. among the countries with an increase in rank, most are from the High HDI group.
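
The sign convention behind these bullets can be sketched on a small hypothetical data set (the `toy_ranks` tibble and the `Direction` labels are illustrative assumptions). `DiffValue` is defined as in the chunk for Fig 4.1, so a negative value means the rank number in 2019 is smaller than in 2018. Since rank 1 is the top HDI position, a smaller rank number is a move towards the top of the table.

```r
library(dplyr)

# Hypothetical toy ranks; column names mirror the dashboard's HDI data.
toy_ranks <- tibble(
  Country = c("A", "B", "C", "D"),
  HDI_rank_2018 = c(1, 3, 2, 4),
  HDI_rank_2019 = c(1, 2, 3, 4)
)

rank_change <- toy_ranks %>%
  mutate(
    # Same definition as DiffValue in the Fig 4.1 chunk
    DiffValue = HDI_rank_2019 - HDI_rank_2018,
    # Label the direction of the change in the rank number
    Direction = case_when(
      DiffValue < 0 ~ "Rank number fell",
      DiffValue > 0 ~ "Rank number rose",
      TRUE ~ "Same"
    )
  )

# Number of countries in each direction
direction_counts <- count(rank_change, Direction)
```

Labelling the three cases explicitly keeps countries with an unchanged rank separate, and makes it easier to state which direction counts as an improvement when reading the graph.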
Part B {data-navmenu="Question 4"}
=====================================
Column {data-height=50}
---------------------------------------------------
### Table 4.2: No. of countries with same rank
```{r Tab8, message=FALSE, warning=FALSE, echo=FALSE, fig.width=8}
Tab2 <- (count(Q4) - count(Diffr)) %>%
rename(`No. of countries with the same rank` = n )
knitr::kable(Tab2, align = "c") %>%
kable_material()
```
```{r}
VH <- Same %>%
filter(Degree_of_Human_Development == "VERY HIGH HUMAN DEVELOPMENT")
g <- VH[1:5,]
H <- Same %>%
filter(Degree_of_Human_Development == "HIGH HUMAN DEVELOPMENT")
i <- H[1:5,]
M <- Same %>%
filter(Degree_of_Human_Development == "MEDIUM HUMAN DEVELOPMENT")
j <- M[1:5,]
L<- Same %>%
filter(Degree_of_Human_Development == "LOW HUMAN DEVELOPMENT")
q <- L[1:5,]
k <- full_join(g,i)
s <- full_join(k,j)
scatter <- full_join(s,q)
```
Column {data-height=250}
---------------------------------------------------
### Fig 4.2: **Top 5** ranking countries (**rank maintained** from 2018 to 2019) from **each HDI Category**.
```{r Fig11, warning=FALSE, echo=FALSE, fig.height=9, fig.width=25}
df <- scatter
fig <- df %>%
plot_ly(
y = ~Country,
x = ~HDI_rank_2019,
color = ~Degree_of_Human_Development,
frame = ~Degree_of_Human_Development,
type = 'scatter',
mode = 'markers'
)
# Hide the legend, apply the animation options, then print the final figure
fig <- fig %>%
hide_legend() %>%
animation_opts(
1000, easing = "elastic", redraw = FALSE
)
fig
```
### Analysis for Q4 Part B.
+ The top position (rank 1) is held by **Norway** in both years; it **belongs to the very high HDI group.**
Part A {data-navmenu="Conclusion"}
=====================================
- **Most** countries fall under *Very High Human Development* and **few** under *Low Human Development*.
- HDI value, life expectancy, and mean years of schooling have more values compared to 'expected years of schooling' and GNI. Also, **outliers exist only in the variables expected years of schooling and GNI per capita.**
- **Higher income** will result in a **high human development group**, which means a **higher HDI value**.
- **Very High HDI Group** translates **income better.**
- It is **accurate** to use the **‘average value of expected years of education and average years of education’ for the calculation of HDI.**
- There are **112 countries** that have a **different HDI rank** for 2018 and 2019. Of these, **more** countries have had a **decline** in rank from 2018 to 2019 than an increase.
- There are **77 countries** that have the **same rank** for 2018 and 2019, of which **Norway has maintained its 1st rank overall.**
Column {data-height=300}
---------------------------------------------------
```{r}
HDI$iso3 <- countrycode(HDI$Country, 'country.name', 'iso3c')
HDI[HDI$Country=="Kosovo","iso3"] <- "XKX"
```
```{r map, fig.width=5, fig.height=5, echo=FALSE, warning=FALSE, message=FALSE, out.width= '60%'}
fig <- plot_ly(HDI, type = 'choropleth', locations = ~iso3, z = ~HDI_rank_2019,
text = ~paste("\nCountry: ", Country,
"\nHDI rank 2019:", HDI_rank_2019), colorscale = "Viridis")
fig <- fig %>% colorbar(title = "HDI Rank" )
fig <- fig %>% layout(
title = 'HDI Country ranks 2019')
fig
```
###
```{r img13, echo = F, out.width = '3%'}
knitr::include_graphics("images/g.png")
```
Part B {data-navmenu="Conclusion"}
=====================================
**References**
[1] Wickham et al., (2019). Welcome to the tidyverse. Journal of Open
Source Software, 4(43), 1686, https://doi.org/10.21105/joss.01686
[2] Hadley Wickham and Jim Hester (2020). readr: Read Rectangular Text
Data. R package version 1.4.0.
https://CRAN.R-project.org/package=readr
[3] Hadley Wickham and Evan Miller (2020). haven: Import and Export
'SPSS', 'Stata' and 'SAS' Files. R package version 2.3.1.
https://CRAN.R-project.org/package=haven
[4] Hao Zhu (2021). kableExtra: Construct Complex Table with 'kable'
and Pipe Syntax. R package version 1.3.4.
https://CRAN.R-project.org/package=kableExtra
[5] R Core Team (2021). R: A language and environment for statistical
computing. R Foundation for Statistical Computing, Vienna, Austria.
URL https://www.R-project.org/.
[6] Frank E Harrell Jr, with contributions from Charles Dupont and many
others. (2021). Hmisc: Harrell Miscellaneous. R package version
4.5-0. https://CRAN.R-project.org/package=Hmisc
[7] Baptiste Auguie (2017). gridExtra: Miscellaneous Functions for
"Grid" Graphics. R package version 2.3.
https://CRAN.R-project.org/package=gridExtra
[8] Katherine Goode and Kathleen Rey (2019). ggResidpanel: Panels and
Interactive Versions of Diagnostic Plots using 'ggplot2'. R package
version 0.3.0. https://CRAN.R-project.org/package=ggResidpanel
[9] Claus O. Wilke (2020). cowplot: Streamlined Plot Theme and Plot
Annotations for 'ggplot2'. R package version 1.1.1.
https://CRAN.R-project.org/package=cowplot
[10] Lüdecke D (2021). _sjPlot: Data Visualization for Statistics in
Social Science_. R package version 2.8.7.
[11] Silge J, Robinson D (2016). “tidytext: Text Mining and Analysis Using Tidy Data
Principles in R.” _JOSS_, *1*(3). doi: 10.21105/joss.00037 (URL:
https://doi.org/10.21105/joss.00037).
[12] Kovacevic, M. (2010). Review of HDI Critiques and Potential Improvements. Human Development Research Paper, 2010/33. Retrieved 24 May 2021, from https://www.researchgate.net/publication/235945302_Review_of_HDI_Critiques_and_Potential_Improvements_Human_Development_Research_Paper_201033
[13] Human Development Index (HDI) | Human Development Reports. (2021). Retrieved 24 May 2021, from http://hdr.undp.org/en/content/human-development-index-hdi
Part C {data-navmenu="Conclusion"}
=====================================
Row {data-width=600}
-------------------------------------
###
```{r img9, echo = F, out.width = '5%'}
knitr::include_graphics("images/thanks.jpg")
```
###
```{r img10, echo = F, out.width = '3%'}
knitr::include_graphics("images/qu.jpg")
```
Column {data-height=300}
-------------------------------------
###
**Credits** :
- Xiaoyu Tian
- Nishtha Arora
- Shaohu Chen
- Nurlaily Furqandari Suliana
###
```{r img11, echo = F, out.width = '0.1%'}
knitr::include_graphics("images/mon.png")
```